Attorney Docket No. 9882-012

A NEW TYROSINE RECOMBINASE FOR 
GENETIC ENGINEERING 



1. INTRODUCTION 

The present invention is directed to methods and compositions for TnpI-mediated
genetic engineering using the Bacillus thuringiensis recombinase TnpI and the TnpI
recombination substrates TRT or TRT', and variants thereof. In particular, the invention
relates to TRT or TRT' sequences, and variants thereof, vectors, cells and kits useful for
TnpI-mediated genetic recombination, as well as methods for the use of TRT or TRT'
sequences for TnpI-mediated genetic engineering.

2. BACKGROUND OF THE INVENTION

TnpI is a site-specific recombinase (SSR) encoded by the transposon Tn4430, a
member of the Tn3 family from Bacillus thuringiensis. TnpI belongs to the integrase
family of SSRs, also referred to as the tyrosine recombinase family, and, as such, has
structural similarities to other members of the integrase family of SSRs such as λ Int
(Mahillon and Lereclus, 1988, EMBO J. 7:1515-26). It has been reported that TnpI
mediates recombination of DNA molecules containing two relatively long DNA regions
taken from Tn4430 in Bacillus thuringiensis (Salamitou et al., 1997, Gene 202:121-26;
Sanchis et al., 1997, Appl. Environ. Microbiol. 63:779-84), and it was postulated to further
require a host-derived factor (Mahillon and Lereclus, supra).

Two site-specific recombinases of the tyrosine class, Cre recombinase from the
Escherichia coli phage P1, and Flp recombinase from the Saccharomyces cerevisiae 2
micron episome, have opened a new dimension in the art of intentional genetic engineering
in higher eukaryotes (Kilby et al., 1993, Trends Genet. 9:413-21). Both recombinases
recognize specific 34 bp recombination target sequences, called RTs. In the presence of
active recombinase protein, only two corresponding RTs are required for recombination.
The utility of these two recombinases lies in the fact that in both purified reactions in vitro
and in living systems, other trans-acting factors or DNA sequence elements are
unnecessary. By selected disposition of RTs and expression of the corresponding
recombinase, precise genetic rearrangements can be effected in living cells.

Although very useful, Cre and Flp recombinases may not be suitable for many
genetic engineering applications. For example, Flp originates from yeast and consequently
has a thermal optimum of enzyme activity at 30°C. However, it is not very efficient at
37°C (Buchholz et al., 1996, Nucleic Acids Res. 24:4256-62), thereby limiting its
applicability in genetic engineering to those hosts that grow in the appropriate temperature
range. In addition, recombination reactions catalyzed by Cre and Flp are reversible, giving
rise to partially rearranged DNA molecules. Other biochemical characteristics also impinge
upon recombination efficiency, again potentially limiting host range (Ringrose et al., 1998,
J. Mol. Biol. 284:363-84). Thus it is important to identify new recombinases so that the
genetic engineer can choose an optimal recombinase for use in the desired host organism.

Furthermore, there is great potential in genetic engineering for use of two, or more,
recombinases in concert (Meyers et al., 1998, Nat. Genet. 18:136-41). However, this
potential has not been developed, mainly due to the absence of a suitable combination of
recombinases that are efficient in a given host. Although there are now more than two
hundred candidate tyrosine recombinases in the databases, it is not possible to predict which
candidates will be useful for genetic engineering by protein sequence similarity alone.
Many of these more than 200 candidates probably work in multi-protein complexes and
require auxiliary factors for efficient recombination. Furthermore, in many cases, their RTs
are not obvious and may not be short.

Thus there is a continuing need to identify new recombinases that work efficiently 
on short recombination targets and do not require auxiliary factors. 

Citation of a reference herein shall not be construed as an admission that such is 
prior art to the present invention. 

3. SUMMARY OF THE INVENTION 

The present invention relates to compositions and methods for site-specific
recombination using the recombinase TnpI, together with TnpI recombination target sites,
either in a cell or in a cell-free system. The invention encompasses methods for in vivo and
in vitro TnpI-mediated site-specific recombination, nucleic acid molecules, host cells and
vectors that may be employed in such methods, and kits for use with such methods, nucleic
acid molecules, host cells, and vectors.

The present invention is based on the discovery by the inventors of novel properties
of TnpI and TnpI recombination target sites which are useful for genetic engineering. First,
TnpI mediates recombination at two functionally distinct TnpI recombination target sites,
herein identified as TRT and TRT'. TRT, the minimum 'core' TnpI recombination target
site, consists of a 32 bp recombination target (RT) site, comprising two inverted repeat
sequences. The larger TRT' is a 116 bp sequence comprising the core TRT as well as two
direct repeat sequences. Second, TnpI-mediated site-specific recombination using TRT and
TRT' substrates can be accomplished both in vitro and in vivo, without requiring any
auxiliary factors. These previously unknown abilities of TnpI and its recombination
substrates (referred to herein as the TnpI/TRT system) result in useful properties for genetic
engineering.

In one embodiment, the invention provides a composition comprising an isolated
DNA molecule comprising one or more copies of TRT (SEQ ID NO:3) or a functional
variant thereof, with the proviso that the DNA molecule does not comprise the entire
sequence of TRT" (SEQ ID NO:4). In a specific embodiment, the DNA molecule
comprises one or more copies of TRT (SEQ ID NO:3), or a functional variant thereof, and a
heterologous nucleotide sequence. In another specific embodiment, the DNA molecule does
not comprise more than 32, 50, 100, 150, or 200 contiguous nucleotides of the sequence of
TRT" (SEQ ID NO:4). In another embodiment, the DNA molecule further comprises a
selectable marker. In another embodiment, the DNA molecule is a vector.

In another embodiment, the invention provides a composition comprising an isolated
DNA molecule comprising one or more copies of TRT' (SEQ ID NO:2), or a functional
variant thereof, with the proviso that the DNA molecule does not comprise the entire
sequence of TRT" (SEQ ID NO:1). In a specific embodiment, the DNA molecule comprises
one or more copies of TRT' (SEQ ID NO:2), or a functional variant thereof, and a
heterologous nucleotide sequence. In various specific embodiments, the DNA molecule
comprises one or more copies of TRT' (SEQ ID NO:2), or a functional variant thereof, with
the proviso that the DNA molecule does not comprise more than 116, 125, 150, 175, or 200
contiguous nucleotides of the sequence of TRT" (SEQ ID NO:4). In another embodiment,
the DNA molecule further comprises a selectable marker. In another embodiment, the DNA
molecule is a vector.

In another embodiment, the invention provides a cell transformed with a DNA
molecule, said DNA molecule comprising one or more copies of TRT (SEQ ID NO:3), or a
functional variant thereof, with the proviso that the DNA molecule does not comprise the
entire sequence of TRT" (SEQ ID NO:4). In another embodiment, the invention further
provides a cell transformed with a DNA molecule, said DNA molecule comprising one or
more copies of TRT' (SEQ ID NO:2), or a functional variant thereof, with the proviso that
the DNA molecule does not comprise the entire sequence of TRT" (SEQ ID NO:4). In
specific embodiments, the DNA molecule is integrated into the chromosome of the cell.

The invention further provides a eukaryotic cell transformed with a DNA molecule
integrated into its chromosome, said DNA molecule comprising one or more copies of TRT
(SEQ ID NO:3) or a functional variant thereof, or TRT' (SEQ ID NO:2), or a functional
variant thereof. In a specific embodiment, the cell is a mouse embryonic stem cell. In a
specific embodiment, the DNA molecule comprises two copies of TRT (SEQ ID NO:3), or a
functional variant thereof, separated by a heterologous nucleotide sequence. In another
embodiment, the DNA molecule comprises two copies of TRT' (SEQ ID NO:2), or a
functional variant thereof, separated by a heterologous nucleotide sequence.

The invention further provides kits for use of the TnpI/TRT system for genetic
engineering. In one embodiment, a kit is provided comprising in one or more containers: a) an
isolated DNA molecule comprising one or more copies of TRT (SEQ ID NO:3) or a functional
variant thereof; and b) an isolated TnpI protein, a TnpI expression vector or a cell capable of
expressing TnpI. In another embodiment, a kit is provided comprising in one or more
containers: a) an isolated DNA molecule comprising one or more copies of TRT' (SEQ ID
NO:2) or a functional variant thereof; and b) an isolated TnpI protein, a TnpI expression
vector or a cell capable of expressing TnpI.

The invention further provides methods for use of the TnpI/TRT system for genetic
engineering. In one embodiment, a method is provided for effecting TnpI-mediated
site-specific recombination comprising exposing a first TnpI recombination target site and a
second TnpI recombination target site to TnpI protein, under sufficient conditions and in
an amount sufficient to mediate site-specific recombination between the first and second
TnpI recombination target sites, wherein the first and second TnpI recombination target
sites are selected from the group consisting of TRT (SEQ ID NO:3) or a functional variant
thereof, and TRT' (SEQ ID NO:2) or a functional variant thereof, with the proviso that the
DNA molecule does not comprise the entire sequence of TRT" (SEQ ID NO:4).

In a specific embodiment of this method, site-specific recombination occurs in vitro.
In another specific embodiment, the site-specific recombination occurs in a cell. In one
embodiment, the cell is a non-Bacillus thuringiensis cell. In another embodiment, the cell
is not a Gram positive cell. In another embodiment, the first TnpI recombination target site
and the second TnpI recombination target site are on different DNA molecules. In yet
another embodiment, the first TnpI recombination target site and the second TnpI
recombination target site are on the same DNA molecule. In one embodiment of this
method, the DNA molecule is chromosomal DNA. In another embodiment, the first TnpI
recombination target site and the second TnpI recombination target site are in direct
orientation. In another embodiment, the DNA molecule further comprises one or more
non-TnpI site-specific recombination target sites. In another embodiment, at least one of the
one or more non-TnpI sites is a Cre recombinase target site. In another embodiment, at
least one of the one or more non-TnpI sites is a Flp recombinase target site. In another
embodiment, the DNA molecule comprises, in the following order, from 5' to 3', the first
TnpI recombination target site, a heterologous nucleotide sequence, and the second TnpI
recombination target site.

The invention further provides a method for effecting site-specific recombination in
a non-Bacillus thuringiensis cell comprising exposing a first TnpI recombination target site
and a second TnpI recombination target site to TnpI protein, under sufficient conditions and
in an amount sufficient to mediate site-specific recombination between the first TnpI
recombination target site and second TnpI recombination target site, wherein the first and
second TnpI recombination target sites are selected from the group consisting of TRT (SEQ
ID NO:3) or a functional variant thereof, and TRT' (SEQ ID NO:2) or a functional variant
thereof. In one embodiment, this method further comprises, before said exposing, a step
of introducing into the cell a first TnpI recombination target site and a second TnpI
recombination target site. In another embodiment, the first TnpI recombination target site
and the second TnpI recombination target site are on different DNA molecules. In an
alternative embodiment, the first TnpI recombination target site and the second TnpI
recombination target site are on the same DNA molecule. In another embodiment, the first
TnpI recombination target site and the second TnpI recombination target site are in direct
orientation. In another embodiment, the DNA molecule further comprises one or more
non-TnpI site-specific recombination target sites. In yet another embodiment, at least one of
the one or more non-TnpI sites is a Cre recombinase target site. In another embodiment, at
least one of the one or more non-TnpI sites is a Flp recombinase target site. In a specific
embodiment, the DNA molecule comprises, in the following order, from 5' to 3', the first
TnpI recombination target site, a heterologous nucleotide sequence, and the second TnpI
recombination target site, such that recombination between the first and the second TnpI
recombination target site results in deletion of the heterologous nucleotide sequence. In
another specific embodiment, the first TnpI recombination target site and the second TnpI
recombination target site are in inverse orientation. In another specific embodiment, the cell
contains a sequence encoding TnpI operably linked to a promoter. In another embodiment,
the promoter is an inducible or tissue-specific promoter. In another specific embodiment,
the cell is a eukaryotic cell, e.g., a mouse cell or an embryonic stem cell. In another specific
embodiment, the first TnpI recombination target site and the second TnpI recombination
target site are both TRT' sequences (SEQ ID NO:2) or functional variants thereof. In another
embodiment, the first TnpI recombination target site and the second TnpI recombination
target site are both TRT sequences (SEQ ID NO:3) or functional variants thereof.
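
By way of illustration only, the dependence of the recombination outcome on site orientation described above may be sketched at the level of sequence features. The helper `recombine` and the feature labels are hypothetical and form no part of the disclosed methods. The sketch assumes the general behaviour of site-specific recombinases, in which directly oriented sites yield deletion of the intervening segment and inversely oriented sites yield its inversion.

```python
# Feature-level model of recombination between two target sites on one linear DNA.
# Each feature is a (name, strand) pair, where strand is +1 or -1.
# Hypothetical helper for illustration; it does not model cleavage chemistry.

def recombine(features, site="TRT"):
    """Return (product, excised) after recombination between the two `site` copies."""
    idx = [i for i, (name, _) in enumerate(features) if name == site]
    if len(idx) != 2:
        raise ValueError("expected exactly two recombination target sites")
    i, j = idx
    left, right = features[i], features[j]
    between = features[i + 1:j]

    if left[1] == right[1]:
        # Direct orientation: the intervening segment leaves as a circle that
        # carries one site; the other site stays behind in the parent molecule.
        product = features[:i] + [left] + features[j + 1:]
        excised = [right] + between
        return product, excised

    # Inverse orientation: the intervening segment is flipped in place.
    inverted = [(name, -strand) for name, strand in reversed(between)]
    product = features[:i] + [left] + inverted + [right] + features[j + 1:]
    return product, None


if __name__ == "__main__":
    direct = [("promoter", 1), ("TRT", 1), ("puro", 1), ("TRT", 1), ("lacZ", 1)]
    print(recombine(direct))
    inverse = [("promoter", 1), ("TRT", 1), ("puro", 1), ("TRT", -1), ("lacZ", 1)]
    print(recombine(inverse))
```

In the direct case the promoter ends up adjacent to the downstream gene, which is the arrangement exploited by a reporter cassette of the kind shown in Figure 12.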

The invention further provides a method of producing a circular DNA vaccine
comprising: (a) introducing a DNA molecule into a non-Bacillus thuringiensis cell, said
DNA molecule comprising, in the following order, from 5' to 3', a DNA sequence encoding
an antigen of interest, a first TRT' site or functional variant thereof, an origin of replication
and, optionally, one or more selectable markers, and a second TRT' site or functional
variant thereof; and (b) contacting said cell with TnpI protein under sufficient conditions
and in an amount sufficient to mediate site-specific recombination between the first and
second TRT' sites or functional variants thereof, such that recombination between the first
and the second TRT' sites or functional variants thereof results in deletion of the origin of
replication and the optional one or more selectable markers from the DNA molecule, such
that a circular DNA vaccine encoding an antigen of interest is produced.
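
As a minimal sketch of the resolution step in this method, a circular plasmid may be modelled as an ordered list of features, with recombination between two directly repeated TRT' sites splitting it into two circles. The function `resolve_circle` and the feature names are assumptions introduced for illustration only.

```python
# Model a circular plasmid as an ordered feature list (the last feature joins the first).
# Recombination between two directly repeated sites splits it into two circles,
# each retaining one copy of the site. Hypothetical helper, for illustration.

def resolve_circle(features, site="TRT'"):
    idx = [i for i, name in enumerate(features) if name == site]
    if len(idx) != 2:
        raise ValueError("expected exactly two recombination target sites")
    i, j = idx
    circle_a = features[i:j]                 # first site plus everything up to the second site
    circle_b = features[j:] + features[:i]   # second site plus the rest, wrapping around
    return circle_a, circle_b


if __name__ == "__main__":
    plasmid = ["antigen", "TRT'", "ori", "marker", "TRT'"]
    with_ori, vaccine = resolve_circle(plasmid)
    print("replicating circle:", with_ori)
    print("vaccine circle:", vaccine)
```

The circle lacking the origin of replication and marker corresponds to the circular DNA vaccine of step (b).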

As used herein, the term "functional variant" of a TRT or TRT' sequence refers to a
TRT or TRT' sequence with one or more altered nucleotide residues, such that the TRT or
TRT' functional variant, when contained in a DNA molecule with a TRT or TRT', is
capable of acting as a substrate for TnpI-mediated recombination, as assayed by an in vivo
or an in vitro TnpI-mediated recombination assay. As used herein, the term "heterologous
nucleotide sequence" refers to a non-Tn4430 nucleotide sequence.

4. DESCRIPTION OF THE FIGURES

Figure 1A-C. TnpI recognition target sequences.

1A. TRT" [SEQ ID NO:1], a 244 bp fragment containing a direct repeat of the sequence
AAAATCAGA, an inverted repeat of the sequence TAATACAACACAAT, and a direct
repeat of the sequence ACGCAACACAATTTAT.

1B. TRT' [SEQ ID NO:2], a 116 bp fragment containing an inverted repeat of the
sequence TAATACAACACAAT and a direct repeat of the sequence
ACGCAACACAATTTAT.

1C. TRT [SEQ ID NO:3], a 32 bp fragment containing an inverted repeat of the
sequence TAATACAACACAAT.

Figure 2A-B. Sequence and structure of the transposon Tn4430.

2A. A schematic diagram of the transposon Tn4430, with its 38 bp terminal inverted
repeats (IR) represented by triangles and the position of the TnpI and TnpA (transposase)
coding regions depicted as grey arrows.

2B. Sequence of the 249 bp TRT" [SEQ ID NO:4] indicating the following: the Tn4430
left IR (first shaded box); TnpI binding regions (open boxes), within which are shown the
inverted repeats (black arrows), and the conserved 9 bp sequence (5'-CAACACAAT-3')
common to all four TnpI binding sites; the direct repeats (grey arrows); and the TnpI
translational start site (second shaded box).

Figure 3. The coding nucleotide sequence [SEQ ID NO:5] and the deduced amino acid
sequence [SEQ ID NO:6] of TnpI.

Figure 4. TnpI expression vector pYZ-BAD-TnpI, comprising a p15A replication origin, a
kanamycin selection marker, the arabinose-inducible promoter PBAD, and AraC, the
arabinose-responsive repressor that regulates PBAD.

Figure 5. TnpI-mediated recombination in E. coli.

5A. Schematic representation of TnpI recombination substrate and products. The
circular DNA starting material substrate (S) contains two directly repeated TnpI
recombination target sites (arrows) and an origin of replication (filled circles).
Recombination between the two TnpI recombination target sites produces two circular
products (P), only one of which contains the origin of replication and is therefore capable of
replicating in the cell.

5B. Agarose gel of products of TnpI-mediated recombination in E. coli. Plasmids
containing either TRT or TRT' were grown in E. coli, in either the presence (+) or absence (-)
of TnpI expression. Lane 1: size marker. Lane 4: Plasmid containing TRT'-flanked sequence
in the absence of TnpI. Lane 5: TRT'-containing plasmid recombination products after
recombination by TnpI (excised fragment lost during cell division). Lane 6: Plasmid
containing TRT-flanked sequence in the absence of TnpI. Lane 7: TRT-containing plasmid
recombination products after recombination by TnpI. (Lanes 2 and 3 are irrelevant.) Note
the high-molecular weight products, which are not seen with the TRT'-containing plasmid.
Positions of DNA fragments corresponding to the starting material substrate (S) and
recombination products (P) are indicated to the left of the gel.

Figure 6. Overexpression and purification of TnpI.

6A. Schematic representation of the relevant portions of the pGIV007 and pGIV008
expression vectors showing the pBAD::His-tag::tnpI and the pBAD::tnpI genes,
respectively. Positions of the start codons (ATG), the ribosome binding sites (RBS), and
the restriction sites NcoI are indicated.

6B. SDS-polyacrylamide gel electrophoresis showing the elution profile of the H-TnpI
(top) and the TnpI (bottom) proteins from the nickel resin. The different lanes correspond to
fractions that eluted at increased concentrations of imidazole as indicated. Arrowheads
show the position of the H-TnpI and TnpI proteins.

Figure 7. TnpI-mediated recombination in vitro.

7A. Schematic representation of the circular DNA substrate carrying directly repeated
TRT (pGIV016) or TRT' (pGIV014) recombination sites (arrows). Recombination gives
two circular products (P1 and P2) that can be distinguished from the substrate by restriction
with HindIII (HdIII).

7B. Agarose gel (0.8%) of recombination reactions in the presence (+) or absence (-) of
H-TnpI using, as substrates, plasmids carrying two copies of TRT (pGIV016, left panel) or
TRT' (pGIV014, right panel). Positions of DNA fragments corresponding to the substrates
(S1 and S2) and recombination products (P1 and P2) are indicated.

7C. Kinetics of recombination at TRT (squares) and TRT' (diamonds).

Figure 8. Topology of TRT and TRT' recombination reactions. Plasmid substrates were
incubated in the presence (+) or absence (-) of H-TnpI, and then treated in the presence (+)
or absence (-) of DNase I, as indicated.

8A. Agarose gel (0.7%) of TRT recombination products, yielding a DNase I ladder of
catenanes containing 2, 4, 6, etc. nodes (shown to the right). Faint bands correspond to
knotted products arising from multiple rounds of recombination.

8B. Agarose gel (0.7%) of TRT' recombination products yielding a unique 2-noded
catenane product.

Figure 9. Representation of a TnpI-TRT' complex. "Core" = TRT. "DR1" and "DR2" =
direct repeat 1 and direct repeat 2.

Figure 10. Vector pSVpaT, comprising the SV40 promoter PSV40, a puromycin resistance
gene puro flanked by two 32 bp TRT sequences, the β-galactosidase gene LacZ, and a
chloramphenicol resistance gene cm. pA indicates a polyadenylation signal.

Figure 11. Plasmid pCAGGS/TnpI, used to express TnpI in eukaryotic cells, comprises
the constitutive chicken beta-actin promoter Pcba, the rabbit beta-globin polyA-encoding
sequence RBG pA, and splicing acceptor site sA.

Figure 12. TnpI-mediated recombination in eukaryotic cells. Diagram of the mouse ES
reporter cell system cassette (top panel), containing a puromycin resistance gene, puro,
flanked by TRT recombination target sites, and a non-expressed lacZ gene. Upon
expression of TnpI from the plasmid pCAGGS/TnpI, the puromycin resistance gene is
excised and the lacZ gene is placed under the control of the PSV40 promoter and expressed
(bottom panel).

Figure 13. LacZ assay of mouse ES reporter cells expressing TnpI from pCAGGS/TnpI.
LacZ is expressed only when active TnpI is expressed within the ES reporter cell lines, as
indicated by the dark (blue) cells.



5. DETAILED DESCRIPTION OF THE INVENTION 

The present invention is directed to compositions and methods for TnpI
recombinase-mediated genetic engineering. As discussed above, TnpI is a site-specific
recombinase (SSR) encoded by the B. thuringiensis transposon Tn4430 (Mahillon and
Lereclus, supra). Although previously thought to require a host-derived factor for
functionality, as demonstrated herein, TnpI requires only its specific recognition sequence
contained in Tn4430 to mediate recombination.

The invention is based on the discovery by the inventors of the following novel
properties of TnpI and its recombination target sites. First, TnpI mediates recombination at
two functionally distinct TnpI recombination target sites, herein identified as TRT and
TRT'. TRT, a 32 bp recombination target (RT) site (shown in Figure 1C and described in
detail below), includes two inverted repeat sequences, the minimum 'core' recombination
site. The larger TRT' is a 116 bp sequence, comprising the core TRT as well as two direct
repeat sequences (shown in Figure 1B and described in detail below). As described more
fully hereinbelow, both the recombination outcome and efficiency at these sites are
different. For example, recombination between two TRT sequences is 'unconstrained' (or
'relaxed') because it occurs freely between recombination sites, giving rise to all possible
inter- and intramolecular DNA rearrangements in vivo, and a variety of topological products
in vitro. In contrast, the direct repeats of the larger TRT' constrain recombination by
preventing intermolecular recombination events and favoring intramolecular deletion
reactions between directly repeated sites. This 'constrained', unidirectional recombination
leads to the excision of sequences lying between the two TRT' sites. This previously
unknown ability of the TnpI recombinase to mediate recombination reactions with different
outcomes as a function of the structure of its recombination site is an unprecedented and
useful feature for genetic engineering.

Second, TnpI-mediated site-specific recombination using TRT and TRT' substrates
can be accomplished both in vitro and in vivo, without requiring any auxiliary factors.
Thus, for example, TnpI-mediated site-specific recombination can be used in eukaryotic
cells, such as, for example, mouse embryonic stem cells ("ES cells").

Described herein are compositions and methods relating to the use of TnpI with
these two TnpI recombination targets (also referred to herein as the TnpI/TRT system). The
present invention relates to methods for the use of the TnpI/TRT system in eukaryotic or
prokaryotic, non-B. thuringiensis hosts, the use of TRT and TRT' in conjunction with
expressed TnpI to accomplish genetic rearrangements, including excisions from vectors and
from cellular DNA, vectors and combinations of vectors to accomplish those excisions, and
kits utilizing those vectors, cells, and methods. In particular, Section 5.1 describes
compositions of the invention, including DNA constructs designed for site-specific
recombination events using TnpI and a recombination target, and kits comprising such
constructs. Section 5.2 describes methods for use of the invention, including methods for
manipulation, excision, and integration of genes of interest into prokaryotic or eukaryotic
host genomic DNA using TnpI and various genetic constructs that include a recombination
target.

5.1 COMPOSITIONS FOR TNPI-MEDIATED SITE-SPECIFIC RECOMBINATION

Compositions comprising TnpI recombination target sequences useful for genetic
engineering, including TRT', TRT, and sequence variants thereof, are described in detail
herein.

First, the invention encompasses the minimum core recombination site TRT,

5'- TAATACAACACAAT[ATTA]ATTGTGTTGTATTA
[SEQ ID NO:3],

a 32 bp sequence comprising a pair of inverted repeats (underlined), also shown in Figure
1C. Recombination occurs by a TnpI staggered cleavage at symmetrical positions within
this sequence, between the last two nucleotides at the 3' end of the repeat (i.e., CACAA/T),
resulting in a 6 bp central sequence (TATTAA), referred to herein as the 'central crossover
region'.
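
The structure just described can be checked with a short sketch, offered as an illustration only. It assumes that the core site reads as the 14 bp repeat TAATACAACACAAT, the 4 bp ATTA, and the reverse complement of the repeat, which is how the printed sequence is understood here. The variable and function names are hypothetical.

```python
# Rebuild the 32 bp TRT core from its 14 bp repeat and locate the central crossover region.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

LEFT_REPEAT = "TAATACAACACAAT"   # 14 bp repeat named in the text
MIDDLE = "ATTA"                  # nucleotides lying between the two repeats
TRT = LEFT_REPEAT + MIDDLE + reverse_complement(LEFT_REPEAT)

# Cleavage falls one nucleotide inside the 3' end of each repeat (CACAA/T),
# so the crossover region is the middle 4 bp plus one flanking nucleotide per side.
cut_left = len(LEFT_REPEAT) - 1
cut_right = len(TRT) - cut_left
CENTRAL_CROSSOVER = TRT[cut_left:cut_right]

if __name__ == "__main__":
    print(TRT, len(TRT))
    print(CENTRAL_CROSSOVER)
    # The crossover region is not its own reverse complement, so the site has a direction.
    print(reverse_complement(CENTRAL_CROSSOVER) != CENTRAL_CROSSOVER)
```

Because the central crossover region is asymmetric, two copies of the site on one molecule can be described as being in direct or in inverse orientation.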

The invention further encompasses the larger TRT', a 116 bp sequence comprising
the core TRT site and a pair of direct repeats (bolded):

5'- TAATACAACACAAT[ATTA]ATTGTGTTGTATTAGGTG
TTATAATAAATATAAATCTAGGGGTTTAACGCAACA
CAATTTATCGATAAATAAATACTTTTAGACGCAACA
CAATTTAT [SEQ ID NO:2],

which is depicted in Figure 1B.
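
The layout of TRT' may be verified in the same manner. The sketch below is illustrative only and assumes the 116 bp sequence as read from the listing above with the bracket characters removed. `find_all` is a hypothetical helper.

```python
# Locate the core site and the two direct repeats within the 116 bp TRT' sequence.

CORE = "TAATACAACACAATATTAATTGTGTTGTATTA"   # 32 bp TRT
DIRECT_REPEAT = "ACGCAACACAATTTAT"          # 16 bp accessory direct repeat
TRT_PRIME = (
    "TAATACAACACAATATTAATTGTGTTGTATTAGGTG"
    "TTATAATAAATATAAATCTAGGGGTTTAACGCAACA"
    "CAATTTATCGATAAATAAATACTTTTAGACGCAACA"
    "CAATTTAT"
)

def find_all(seq, motif):
    """Return every 0-based start position of `motif` in `seq`, overlaps included."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits

repeat_starts = find_all(TRT_PRIME, DIRECT_REPEAT)
spacer = TRT_PRIME[len(CORE):repeat_starts[0]]   # region between core and first repeat

if __name__ == "__main__":
    print(len(TRT_PRIME), TRT_PRIME.startswith(CORE))
    print(repeat_starts, len(spacer))
```

Under this reading the spacer between the core site and the first direct repeat is 32 bp long, and 20 bp separate the two direct repeats.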
Both TRT and TRT' are sequences contained entirely within the Tn4430 left IR
shown in Figure 1A (referred to herein as TRT"). In one embodiment, the invention further
encompasses a DNA molecule comprising one or more copies of the TRT site, providing
the DNA molecule does not comprise the entire sequence of TRT". In a preferred
embodiment, the DNA molecule consists of the TRT site and additional nucleotide
sequences, providing the additional sequence does not comprise more than 216 contiguous
nucleotides of TRT". Preferably, the additional nucleotide sequence comprises not more
than 2, 5, 10, 20, 30, 40, 50, 100, 150, or 200 contiguous nucleotides of TRT".
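
Whether a given construct meets a limit on contiguous nucleotides of this kind can be assessed by finding the longest stretch it shares with the reference sequence. The following is a sketch under stated assumptions: the function names are hypothetical, only the strand as written is compared, and the reference sequence must be supplied by the user, since the full TRT" sequence is not reproduced in this section.

```python
# Longest run of contiguous nucleotides shared by a construct and a reference sequence
# (classic longest-common-substring dynamic programme, one row kept at a time).

def longest_shared_run(construct, reference):
    best = 0
    prev = [0] * (len(reference) + 1)
    for a in construct:
        curr = [0] * (len(reference) + 1)
        for j, b in enumerate(reference, start=1):
            if a == b:
                curr[j] = prev[j - 1] + 1   # extend the run ending at the previous pair
                best = max(best, curr[j])
        prev = curr
    return best

def within_limit(construct, reference, max_contiguous):
    """True if the construct shares no more than `max_contiguous` nucleotides in a row."""
    return longest_shared_run(construct, reference) <= max_contiguous


if __name__ == "__main__":
    # Toy sequences only; they are not taken from Tn4430.
    print(longest_shared_run("AAACCCGGG", "TTCCCGTT"))
    print(within_limit("AAACCCGGG", "TTCCCGTT", 5))
```

A fuller check would also compare the reverse complement of the construct, which this sketch omits for brevity.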

In another embodiment, the invention further encompasses a DNA molecule
comprising one or more copies of the TRT' site, providing the DNA molecule does not
comprise the entire sequence of TRT". In a preferred embodiment, the DNA molecule
consists of the TRT' site and additional nucleotide sequences, providing the additional
sequence does not comprise more than 132 contiguous nucleotides of TRT". Preferably, the
additional nucleotide sequence comprises not more than 2, 5, 10, 20, 30, 40, 50, or 100
contiguous nucleotides of TRT".

In a preferred embodiment, the invention encompasses a composition comprising a
DNA molecule consisting of one or more copies of TRT and a heterologous nucleotide
sequence. In another preferred embodiment, such a composition comprises a DNA
molecule consisting of one or more copies of TRT' and a heterologous nucleotide sequence.
As used herein, the term "heterologous nucleotide sequence" refers to a non-Tn4430
nucleotide sequence.

The invention further encompasses functional variants of TRT or TRT' sequences.
As used herein, the term "functional variant" of a TRT or TRT' sequence refers to a TRT or
TRT' sequence with one or more altered nucleotide residues, such that the TRT or TRT'
functional variant, when contained in a DNA molecule with a TRT or TRT', is capable of
acting as a substrate for TnpI-mediated recombination, as assayed by an in vivo or an in
vitro TnpI-mediated recombination assay. Such a TRT or TRT' functional variant sequence
may be constructed by modifying the identity of one or more nucleotides between or within
the core TRT site, or in the case of a variant TRT' sequence, within the accessory direct
repeats or the TRT' spacer region between the core TRT site and the first accessory direct
repeat (referred to herein as the "spacer region"). Such functional variants may be single
point mutations, double point mutations, or multiple point mutations, insertions and/or
deletions, and may be constructed using routine methods well known in the art (see, e.g.,
Sambrook et al., 1989, Molecular Cloning - A Laboratory Manual, 2nd Edition, Cold
Spring Harbor Press, New York).

For example, in various embodiments, sequences which may be altered include, but
are not limited to, the TnpI binding sites, the TRT central crossover region, or sequences
within the TRT' spacer region. Such functional variant TnpI recombination target
sequences may then be tested for TnpI binding and recombination activity in vivo or in
vitro, in the various assays described in detail herein. In a preferred embodiment, such
assays are used to select for functional variant TnpI recombination target sequences with
increased binding affinity and/or recombination activity.

In one embodiment, for example, a TRT' functional variant is constructed
comprising one or more altered nucleotide residues within the spacer region, which
preserves the sequence of the core TRT site and the accessory direct repeats themselves.
Such a functional variant TRT' sequence may then be tested for recombination activity in
vivo or in vitro, as described herein. For example, in one embodiment, such a functional
variant TRT' sequence comprises the following sequence:

5'- TAATACAACACAAT[ATTA]ATTGTGTTGTATTAGGTG
TTATAATAAATATATATCTAGGGGTTTAACGCAACA
CAATTTATCGATAAATAAATACTTTTAGACGCAACA
CAATTTAT [SEQ ID NO:7]

wherein the altered nucleotide residue is underlined and in bold.
In another embodiment, a functional variant TRT' sequence comprises the following
sequence:

5'- TAATACAACACAAT[ATTA]ATTGTGTTGTATTAGGTG
TTATAATATATATAAATCTAGGGGTTTAACGCAACA
CAATTTATCGATAAATAAATACTTTTAGACGCAACA
CAATTTAT [SEQ ID NO:8],

wherein the altered nucleotide residue is underlined and in bold.
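
Because the underlining and bold type do not survive in plain text, the altered residues in the two variant sequences may be located by comparison with TRT'. The sketch below is illustrative only. It assumes that the variants are identical to TRT' in the core site and direct repeats, as the text states, and that the sequences read as transcribed here with bracket characters removed. All names are hypothetical.

```python
# Locate single-nucleotide differences between TRT' and the two spacer variants.

TRT_PRIME = (
    "TAATACAACACAATATTAATTGTGTTGTATTAGGTG"
    "TTATAATAAATATAAATCTAGGGGTTTAACGCAACA"
    "CAATTTATCGATAAATAAATACTTTTAGACGCAACA"
    "CAATTTAT"
)
VARIANT_SEQ7 = (
    "TAATACAACACAATATTAATTGTGTTGTATTAGGTG"
    "TTATAATAAATATATATCTAGGGGTTTAACGCAACA"
    "CAATTTATCGATAAATAAATACTTTTAGACGCAACA"
    "CAATTTAT"
)
VARIANT_SEQ8 = (
    "TAATACAACACAATATTAATTGTGTTGTATTAGGTG"
    "TTATAATATATATAAATCTAGGGGTTTAACGCAACA"
    "CAATTTATCGATAAATAAATACTTTTAGACGCAACA"
    "CAATTTAT"
)
SPACER = range(32, 64)   # 0-based positions between the core site and the first direct repeat

def substitutions(reference, variant):
    """Return (0-based position, reference base, variant base) for each mismatch."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(reference, variant)) if a != b]


if __name__ == "__main__":
    for label, seq in (("SEQ ID NO:7", VARIANT_SEQ7), ("SEQ ID NO:8", VARIANT_SEQ8)):
        changes = substitutions(TRT_PRIME, seq)
        print(label, changes, all(pos in SPACER for pos, _, _ in changes))
```

Under these assumptions each variant differs from TRT' by a single A-to-T substitution lying within the spacer region.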

In general, such a functional variant TRT' sequence may have the following general 
sequence: 

5'- T A AT AG A ACACAAT r ATTA1 ATTGTGTTGTATTArX„lA 
CGCAACACAATTTAT[XJACGCAACACAATTTAT 

[SEQ ID NO:9],

wherein X represents a nucleotide residue. 

In another embodiment, a functional variant TRT' sequence may be constructed by 
modifying the length of the spacer between the core TRT site and the accessory direct 
repeats. For example, the spacer may be modified by the addition of one integral helical
turn of the DNA helix, i.e., 10 bp, or a multiple thereof, e.g., 21 or 31 bp. In
one embodiment, for example, nucleotide sequences having the following consensus
sequence may be used as Tnpl recombination targets:

5'- T A AT AC A ACACAATF ATTAI ATTGTGTT GTATTArX.,1A 
CGCAACACAATTTACGATAAATAAATACTTTTAGAC 

GCAACACAATTTA [SEQ ID NO:10],

wherein the underlined nucleotide residues depict the inverted repeats, bolded nucleotide 
residues represent direct repeats, and Xn represents a spacer region sequence comprising an
additional integral helical turn between the core TRT site and the accessory direct repeats. 
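
The spacer lengths quoted above (10, 21 or 31 bp) are consistent with whole-base-pair approximations of one, two and three helical turns. The following Python sketch illustrates that arithmetic only. It assumes an average B-DNA pitch of about 10.5 bp per turn, a figure that is not stated in this specification, and the function names and the filler base are hypothetical.

```python
import math

# Assumption: average B-DNA helical pitch in bp per turn (not given in the text).
HELICAL_PITCH_BP = 10.5


def added_spacer_lengths(max_turns: int) -> list[int]:
    """Whole-bp insert lengths approximating 1..max_turns integral helical turns."""
    return [math.floor(turns * HELICAL_PITCH_BP) for turns in range(1, max_turns + 1)]


def extend_spacer(spacer: str, turns: int, filler: str = "N") -> str:
    """Append filler bases (hypothetical placeholder) worth `turns` helical turns."""
    return spacer + filler * math.floor(turns * HELICAL_PITCH_BP)


print(added_spacer_lengths(3))  # [10, 21, 31]
```

Rounding down reproduces the 10, 21 and 31 bp values given above; the actual sequence of any added spacer would have to be chosen and tested experimentally.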

Preferably, the accessory direct repeats are tandemly repeated on the same DNA
molecule as the core TRT site. In another embodiment, the orientation of the direct repeats
within the recombination target site may be manipulated. Reorienting the direct repeats can 
result in more complete removal of TRT sequences, which may be advantageous to genetic 
engineering applications. In alternative embodiments, for example, the orientation of the 
direct repeats may be forward, as depicted above, or reversed ("flipped") with respect to the 
inverted repeats. 

The invention further comprises compositions with more than one copy of a Tnpl
recombination target sequence, such as TRT', TRT, or a functional variant thereof, on a
single DNA molecule. Preferably, the DNA molecule comprises two Tnpl recombination
target sequences in tandem, separated by a variable length of heterologous sequence, such
that recombination may occur between such Tnpl recombination target sites. In one
embodiment, such a DNA molecule comprises two Tnpl recombination target sites, such as 
TRT', TRT, or a functional variant thereof, in direct repeat orientation. In another 
embodiment, such a DNA replicon comprises two Tnpl recombination target sites, such as 
TRT', TRT, or a functional variant thereof, in inverted repeat orientation. 

5.1.1 TNPI RECOMBINATION TARGET VECTORS

Vectors comprising Tnpl recombination targets may be constructed. Tnpl
recombination target sites may be separated from each other by a nucleotide sequence, such
as a nucleotide sequence encoding a selectable marker, or a nucleotide sequence encoding a
gene of interest, such that recombination between the target sequences deletes or inverts the
nucleotide sequence located between the two Tnpl recombination targets.
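
The dependence of the outcome on site orientation can be illustrated with a short Python sketch. This is a simplified string model under stated assumptions, not a description of the Tnpl reaction mechanism: the site `AAGGC` is a hypothetical asymmetric placeholder rather than a TRT sequence, strand exchange is treated as occurring at the site boundary, and `recombine` is an illustrative name.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def recombine(dna: str, site: str) -> str:
    """Resolve two target sites on one molecule (toy model).

    Sites in direct repeat: the intervening segment and one site are deleted.
    Sites in inverted repeat: the intervening segment is inverted in place.
    """
    first = dna.find(site)
    if first == -1:
        return dna
    after_first = first + len(site)
    direct = dna.find(site, after_first)
    inverted = dna.find(reverse_complement(site), after_first)
    if direct != -1 and (inverted == -1 or direct < inverted):
        # Direct repeats: excise everything up to the end of the second site.
        return dna[:after_first] + dna[direct + len(site):]
    if inverted != -1:
        # Inverted repeats: flip the segment between the two sites.
        segment = dna[after_first:inverted]
        return dna[:after_first] + reverse_complement(segment) + dna[inverted:]
    return dna


SITE = "AAGGC"  # hypothetical placeholder, not a TRT sequence
print(recombine("TT" + SITE + "CACA" + SITE + "TT", SITE))                      # TTAAGGCTT
print(recombine("TT" + SITE + "CACA" + reverse_complement(SITE) + "TT", SITE))  # TTAAGGCTGTGGCCTTTT
```

In the first call the marker-like segment `CACA` is lost and a single site remains; in the second the segment is retained but inverted.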

In one embodiment, the vector comprises, in order: (1) a promoter; (2) a first TRT'; 
(3) a multiple cloning site (MCS); and (4) a second TRT', wherein the first and second 
TRT's flank the MCS. The MCS includes restriction enzyme cleavage sites that will enable 
the cloning of nucleotide sequences such that those sequences are controlled by the 
promoter. Upon introduction of the vector into a cell, and under the appropriate conditions,
the nucleotide sequence cloned into the MCS will be expressed. Expression is terminated
upon expression of Tnpl in the cell from a different vector. Such a system may be desirable
in situations in which the first vector contains a constitutive promoter, or when it is not
feasible to remove the inducing factor regulating an inducible promoter.

In another embodiment, the vector comprises, in order: (1) a promoter; (2) a first
TRT'; (3) a spacer sequence that prevents the promoter from controlling the expression of
downstream sequences; (4) a second TRT'; and (5) a gene of interest. Upon Tnpl-mediated
recombination, the spacer sequence is excised, allowing the promoter to control expression
of the gene of interest. In this embodiment, the promoter may be constitutive, or,
preferably, inducible.

Such vectors will have the general characteristics outlined in Section 5.1.3, below. 
Vectors carrying recombination targets in such a manner may either be designed to exist as 
episomes, or may be designed to facilitate integration into the host genomic DNA to create 
stable cell lines, e.g., by designing the vector to be linearized. Such vectors are known in the
art.

5.1.2 TNPI EXPRESSION VECTORS

Vectors that express Tnpl can be used as episomes to confer upon the host cell the
ability to express Tnpl. Such vectors, which may be prokaryotic or eukaryotic expression
vectors, will have the general characteristics of vectors as described in Section 5.1.1, above.
The ability to generate a wide range of expression is advantageous for utilizing the methods
of the invention. Such expression can be achieved in a constitutive as well as in a regulated,
or inducible, fashion. A variety of regulatory sequences which allow expression (either
regulated or constitutive) at a range of different expression levels are well known to those of
skill in the art.

It may be desirable to express Tnpl as either a native Tnpl or, alternatively, as a
fusion protein. For example, in one embodiment, Tnpl may be expressed as a fusion
protein with the Tnpl coding sequence linked to a detectable peptide, such as a His-tag fusion
protein, to facilitate detection or purification. In another embodiment, Tnpl may be
expressed as a fusion protein with a peptide which imparts a functional group, such as a
nuclear localization signal, a hormone receptor domain, etc. A variety of peptides useful for
constructing such fusion proteins, and methods for their construction, are well known in the
art.

In a specific embodiment, a vector is used that comprises an inducible promoter
operably linked to a Tnpl-encoding nucleic acid, one or more origins of replication, and,
optionally, one or more selectable markers (e.g., an antibiotic resistance gene). In another
embodiment, a vector is used that comprises a tissue-specific promoter operably linked to a
Tnpl-encoding nucleic acid, one or more origins of replication, and, optionally, one or more
selectable markers.

The chosen vector must be compatible with the vector plasmid described in Section
5.1.1, above. One of skill in the art would readily be aware of the compatibility
requirements necessary for maintaining multiple plasmids in a single cell. Methods for
propagation of two or more constructs in prokaryotic or eukaryotic cells are well known to
those of skill in the art. For example, cells containing multiple replicons can routinely be
selected for and maintained by utilizing vectors comprising appropriately compatible
origins of replication and independent selection systems (see Miller, 1992, A Short Course
in Bacterial Genetics, Cold Spring Harbor Laboratory Press, NY, and references therein;
and Sambrook et al., 1989, supra).

5.1.3 VECTORS GENERALLY 

Circular vectors incorporating a Tnpl coding region, or a region to be manipulated
by Tnpl, may be constructed using standard methods known in the art (see Sambrook et al.,
1989, supra; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing
Associates and Wiley Interscience, New York). For example, synthetic or recombinant
DNA technology may be used. In one embodiment, a Tnpl coding fragment is made by
polymerase chain reaction ("PCR") amplification. In this method, oligonucleotides are
synthesized to include restriction enzyme sites at their 5' ends, and PCR primer sequences
complementary to the boundary sequences of a Tnpl coding region at their 3' ends. These
oligonucleotides are then used as primers in a PCR amplification reaction to amplify the
Tnpl coding region. This amplified region is then cloned into a vector containing an
equivalent restriction site downstream of a promoter in such a manner that the promoter
controls expression of the Tnpl coding sequence. In another embodiment, a plasmid may be
constructed to comprise two appropriately oriented TRTs or TRT's flanking a gene segment
to be excised or inverted, using standard molecular biology techniques (see, e.g., Methods in
Enzymology, 1987, Volume 154, Academic Press; Sambrook et al., 1989, supra; and
Ausubel et al., supra). The circular product is then transformed into Escherichia coli for
amplification to yield large amounts of the vector.
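
The primer layout described above (restriction site at the 5' end, gene-specific sequence at the 3' end) may be sketched in Python as follows. This is an illustrative sketch only: `design_cloning_primers` and the toy open reading frame are hypothetical, the EcoRI (GAATTC) and BamHI (GGATCC) sites are merely examples of sites that might be chosen, and the extra 5' bases often added to aid enzyme cleavage are omitted.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_cloning_primers(coding_region: str, site_5prime: str,
                           site_3prime: str, anneal_len: int = 20) -> tuple[str, str]:
    """Build forward and reverse primers, both written 5' to 3'.

    Each primer carries a restriction site at its 5' end and, at its 3' end,
    sequence matching one boundary of the coding region.
    """
    if anneal_len > len(coding_region):
        raise ValueError("anneal_len exceeds the coding region length")
    forward = site_5prime + coding_region[:anneal_len]
    reverse = site_3prime + reverse_complement(coding_region[-anneal_len:])
    return forward, reverse


ECORI = "GAATTC"
BAMHI = "GGATCC"
toy_orf = "ATGAAACCCGGGTTTTAA"  # hypothetical stand-in, not the Tnpl coding sequence

fwd, rev = design_cloning_primers(toy_orf, ECORI, BAMHI, anneal_len=6)
print(fwd)  # GAATTCATGAAA
print(rev)  # GGATCCTTAAAA
```

In practice the annealing length, melting temperature and the absence of the chosen restriction sites within the coding region would also need to be checked.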

As Tnpl-mediated recombination may be used in either prokaryotic or eukaryotic
systems, the choice of vector construction depends upon the cell line or bacterial strain
under study. For prokaryotic systems, the vector preferably includes an appropriate origin
of replication and one or more selectable markers. For example, in plasmids maintained
and used in E. coli, examples of appropriate origins of replication would be, without
limitation, ColE1-derived origins of replication (Bolivar et al., 1977, Gene 2:95-113; see
Sambrook et al., 1989, supra), p15A origins present on plasmids such as pACYC184
(Chang and Cohen, 1978, J. Bacteriol. 134:1141-56; see also Miller, 1992, supra, p. 10.4-10.11),
and the pSC101 origin. Origins of replication may be selected for high (see Sambrook
et al., 1989, supra; see also Miller, 1992, supra, and references therein), medium (Bolivar et
al., 1977, Gene 2:95-113; see Sambrook et al., 1989, supra), or low (Chang and Cohen,
1978, J. Bacteriol. 134:1141-56; see also Miller, 1992, supra, p. 10.4-10.11) copy number per cell
depending upon the particular application.

For the selectable marker, preferably antibiotic resistance markers are used, such as
the kanamycin resistance gene from Tn903 (Friedrich and Soriano, 1991, Genes Dev.
5:1513-1523), or genes that confer resistance to other aminoglycosides (including but not
limited to dihydrostreptomycin, gentamycin, neomycin, paromycin and streptomycin), or the
TEM-1 β-lactamase gene from Tn9, which confers resistance to penicillin (including but not
limited to ampicillin, carbenicillin, methicillin, penicillin N, penicillin O and penicillin V).
Other selectable gene sequences include, but are not limited to, gene sequences encoding
polypeptides which confer zeocin resistance (Hegedus et al., 1998, Gene 202:241-249).
Other genes that can be utilized are those that confer resistance to amphenicols, such
as chloramphenicol; for example, the coding sequence for chloramphenicol transacetylase
(CAT) can be utilized (Eikmanns et al., 1991, Gene 102:93-98). As will be appreciated by
one skilled in the art, other non-antibiotic methods to select for maintenance of the plasmid
may also be used, such as, for example, a variety of auxotrophic markers (see Sambrook et
al., 1989, supra; Ausubel et al., supra).

Any of the methods previously described for the insertion of DNA fragments into a
vector may be used to construct expression vectors containing a chimeric gene consisting of
appropriate transcriptional/translational control signals and the protein coding sequences.
These methods may include in vitro recombinant DNA and synthetic techniques and in vivo
recombinants (genetic recombination). Expression of a nucleic acid sequence encoding a
Tnpl fragment may be regulated by a second nucleic acid sequence so that the Tnpl protein
or peptide is expressed in a host transformed with the recombinant DNA molecule. For
example, expression of a Tnpl coding sequence may be controlled by any promoter/enhancer
element known in the art. Preferably the expression is controlled by an inducible promoter.
Inducible expression yielding a wide range of expression can be obtained by utilizing a
variety of inducible regulatory sequences. In one embodiment, for example, the lacI gene
and its gratuitous inducer IPTG can be utilized to yield inducible, high levels of expression
of Tnpl when sequences encoding such polypeptides are transcribed via the lacOP
regulatory sequences. A variety of other inducible promoter systems which can also be
utilized are well known to those of skill in the art. Levels of expression from Tnpl
constructs can also be varied by using promoters of different strengths.

Other regulated expression systems that can be utilized include, but are not limited to,
the araC promoter, which is inducible by arabinose (AraC) (see, e.g., Schleif, 2000, Trends
Genet. 16:559-565), the TET system (Geissendorfer and Hillen, 1990, Appl. Microbiol.
Biotechnol. 33:657-663), the pL promoter of phage λ and the temperature-inducible lambda
repressor cI857 (Pirrotta, 1975, Nature 254:114-117; Petrenko et al., 1989, Gene 78:85-91),
the trp promoter and trp repressor system (Bennett et al., 1976, Proc. Natl. Acad. Sci. USA
73:2351-55; Warne et al., 1986, Gene 46:103-112), the lacUV5 promoter (Gilbert and
Maxam, 1973, Proc. Natl. Acad. Sci. USA 70:1559-63), lpp (Nakamura et al., 1982, J. Mol.
Appl. Gen. 1:289-299), the T7 gene-10 promoter, phoA (alkaline phosphatase), recA (Horii
et al., 1980, Proc. Natl. Acad. Sci. USA), and the tac promoter, a trp-lac fusion
promoter, which is inducible by IPTG (Amann et al., 1983, Gene 25:167-78). These
are all commonly used strong promoters, resulting in an accumulated level of about 1 to
10% of total cellular protein for a protein whose level is controlled by each promoter. If a
stronger promoter is desired, the tac promoter is approximately tenfold stronger than
lacUV5, but will result in high baseline levels of expression, and should be used only when
overexpression is required. If a weaker promoter is required, other bacterial promoters are
well known in the art, for example, maltose, galactose, or other desirable promoters
(sequences of such promoters are available from GenBank (Burks et al., 1991, Nucl. Acids
Res. 19:2227-2230)).

For eukaryotic systems, vectors will include eukaryotic-specific promoter regions,
which include specific sequences that are sufficient for RNA polymerase recognition,
binding and transcription initiation. Additionally, promoter regions include sequences that
modulate the recognition, binding and transcription initiation activity of RNA polymerase.
Such sequences may be cis acting or may be responsive to trans acting factors. Depending
upon the nature of the regulation, promoters may be constitutive or regulated. Promoters
that may be used to control Tnpl expression include, but are not limited to, the SV40 early
promoter region (Benoist and Chambon, 1981, Nature 290:304-310), the promoter
contained in the 3' long terminal repeat of Rous sarcoma virus (Yamamoto et al., 1980, Cell
22:787-797), the herpes thymidine kinase promoter (Wagner et al., 1981, Proc. Natl. Acad.
Sci. U.S.A. 78:1441-1445), the regulatory sequences of the metallothionein gene (Brinster
et al., 1982, Nature 296:39-42); plant expression vectors comprising the nopaline synthetase
promoter region (Herrera-Estrella et al., 1984, Nature 303:209-213) or the cauliflower
mosaic virus 35S RNA promoter (Gardner et al., 1981, Nucl. Acids Res. 9:2871), and the
promoter of the photosynthetic enzyme ribulose bisphosphate carboxylase (Herrera-Estrella
et al., 1984, Nature 310:115-120); promoter elements from yeast or other fungi such as the
Gal 4 promoter, or the ADC (alcohol dehydrogenase) promoter.

Another option is to use a promoter that is tissue-specific, i.e., one whose expression
is preferentially activated within a particular tissue and results in the expression of a gene
product in the tissue where activated. Tissue-specific promoters that may be used include
the PGK (phosphoglycerate kinase) promoter, alkaline phosphatase promoter, and the
following animal transcriptional control regions, which exhibit tissue specificity and have
been utilized in transgenic animals: elastase I gene control region which is active in
pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al., 1986, Cold Spring
Harbor Symp. Quant. Biol. 50:399-409; MacDonald, 1987, Hepatology 7:425-515); insulin
gene control region which is active in pancreatic beta cells (Hanahan, 1985, Nature
315:115-122), immunoglobulin gene control region which is active in lymphoid cells
(Grosschedl et al., 1984, Cell 38:647-658; Adames et al., 1985, Nature 318:533-538;
Alexander et al., 1987, Mol. Cell. Biol. 7:1436-1444), mouse mammary tumor virus control
region which is active in testicular, breast, lymphoid and mast cells (Leder et al., 1986, Cell
45:485-495), albumin gene control region which is active in liver (Pinkert et al., 1987,
Genes Dev. 1:268-276), alpha-fetoprotein gene control region which is active in liver
(Krumlauf et al., 1985, Mol. Cell. Biol. 5:1639-1648; Hammer et al., 1987, Science 235:53-58);
alpha 1-antitrypsin gene control region which is active in the liver (Kelsey et al., 1987,
Genes Dev. 1:161-171), beta-globin gene control region which is active in myeloid cells
(Magram et al., 1985, Nature 315:338-340; Kollias et al., 1986, Cell 46:89-94), myelin
basic protein gene control region which is active in oligodendrocyte cells in the brain
(Readhead et al., 1987, Cell 48:703-712), myosin light chain-2 gene control region which is
active in skeletal muscle (Sani, 1985, Nature 314:283-286), and gonadotropic releasing
hormone gene control region which is active in the hypothalamus (Mason et al., 1986,
Science 234:1372-1378).

Vectors that contain both a promoter and a cloning site into which a polynucleotide
can be operatively linked are well known in the art. Such vectors are capable of
transcribing RNA in vitro or in vivo, and are commercially available from sources such as
Stratagene (La Jolla, Calif.) and Promega Biotech (Madison, Wis.). In order to optimize
expression and/or in vitro transcription, it may be necessary to remove, add or alter 5' and/or
3' untranslated portions of the cloned DNA to eliminate extra, potentially inappropriate
alternative translation initiation codons or other sequences that may interfere with or reduce
expression, either at the level of transcription or translation. Alternatively, consensus
ribosome binding sites can be inserted immediately 5' of the start codon to enhance
expression (see, e.g., Kozak, 1991, J. Biol. Chem. 266:19867). Similarly, alternative
codons, encoding the same amino acid, can be substituted for coding sequences in order to
enhance translation (e.g., the codon preference of the host cell can be adopted, the presence
of G-C rich domains can be reduced, and the like).
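
The substitution of synonymous codons mentioned above can be sketched in Python as follows. The sketch is illustrative only: it covers just two amino acids, the "preferred" codons are hypothetical placeholders for a host codon-usage table rather than measured values, and `recode` is an illustrative name.

```python
# Synonymous codon families for two amino acids (standard genetic code).
SYNONYMS = {
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
}

# Assumption: hypothetical host preferences, standing in for a real usage table.
PREFERRED = {"L": "CTG", "R": "CGT"}

# Map every listed codon to the preferred codon for the same amino acid.
CODON_TO_PREFERRED = {
    codon: PREFERRED[amino_acid]
    for amino_acid, codons in SYNONYMS.items()
    for codon in codons
}


def recode(cds: str) -> str:
    """Replace listed codons with the preferred synonym; leave all others unchanged."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(CODON_TO_PREFERRED.get(codon, codon) for codon in codons)


print(recode("ATGTTAAGATAA"))  # ATGCTGCGTTAA
```

Because only synonymous codons are exchanged, the encoded amino acid sequence is unchanged; a complete implementation would use a full codon table for the intended host.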

The vector may also contain nucleotide sequences of interest for protein expression,
manipulation or maintenance of the inserted target DNA. For example, promoter
sequences, enhancer sequences, translation sequences such as Shine-Dalgarno
sequences, transcription factor recognition sites, Kozak consensus sequences, and
termination signals may be included, in the appropriate position in the vector. For
recombination cloning in cells other than bacterial cells, such as plant, insect, yeast or
mammalian cells, other sequence elements may be necessary, such as species-specific
origins of replication, transcription, processing, and translation signals. Such elements may
include, but are not limited to, eukaryotic origins of replication, enhancers, transcription
factor recognition sites, CAT boxes, or Pribnow boxes.

Any method known in the art for delivering a DNA preparation comprising the target
DNA into a host cell is suitable for use with the methods described above. Such methods
are known in the art and include, but are not limited to, electroporation of cells, preparing
competent cells with calcium or rubidium chloride, and transduction of DNA with target
DNA packaged in viral particles. For eukaryotic cells, methods include but are not limited
to electroporation, transfection with calcium phosphate precipitation of DNA, and viral
packaging. In a preferred embodiment, electroporation is used. Cells are treated to make
them competent for electroporation by standard methods (see Ausubel et al., Current
Protocols in Molecular Biology, Greene Publishing Associates and Wiley Interscience, New
York). Preferably, about 50 µl of a standard preparation of electrocompetent cells is used
for electroporation by standard procedures. In experiments that require the transformation
of a linear or circular vector, 0.3 µg or more of vector is preferably used. In experiments
that require the transformation of a DNA preparation containing the target DNA, 0.3 µg or
more is preferably used. For co-transformation experiments, the DNAs are preferably
mixed before electroporation. After electroporation, the cells are preferably diluted in
culture medium and incubated for a recovery period of approximately one and a half hours
before culturing under conditions to identify the phenotypic change conveyed by the
selectable marker gene.

Optimally, in prokaryotic cells, the phenotypic change is resistance to an antibiotic
and the cells are cultured on plates that contain the corresponding antibiotic. In this case,
the antibiotic resistant colonies that appear after overnight culture will predominantly
contain the desired subcloning product.

In another embodiment, DNA is delivered into the host cell by transduction of DNA
that has been packaged into a phage particle. P1 or λ transduction and packaging protocols
are known in the art. Lambda packaging extracts are available commercially (e.g., from
Promega, Madison, WI).

5.1.4 CELLS USEFUL FOR TNPI-MEDIATED SITE-SPECIFIC RECOMBINATION

Compositions include prokaryotic and eukaryotic cells containing DNA molecules
comprising TRT or TRT' sequences. Where genetic manipulation of nucleotide sequences
of interest by recombination is desirable, it is useful to have ready-made cells available that
can express Tnpl to achieve recombination events. In prokaryotic cells, this may be
accomplished by the integration of an expression cassette, consisting of a Tnpl coding
sequence operably linked to a promoter, into the host cell in such a manner that Tnpl
production is effected. This may be accomplished, for example, by using homologous
recombination, transposon-mediated delivery systems, or λ integration to introduce the Tnpl
coding sequence into the host chromosome. Bacterial strains may also be engineered that
carry a Tnpl expression cassette on an episome or plasmid (see, e.g., Salamitou et al., 1997,
Gene 202:121-26). Preferably, the promoter controlling Tnpl expression is inducible.

The invention further relates to eukaryotic cells that contain at least one nucleotide
sequence flanked by recombination targets, wherein the nucleotide sequence is excisable
when Tnpl contacts the recombination targets. The nucleotide sequence can be such that,
upon Tnpl-mediated recombination, the cell loses a gene function, gains a gene function,
dies under certain conditions, or survives under certain conditions. The nucleotide sequence
to be excised or inverted may be contained in an expression vector, a stable episome, or
may be stably integrated into the genome of the host cell.

5.1.5 KITS 

The invention also encompasses kits, comprising components useful for Tnpl-mediated
recombination packaged into suitable containers. Such kits include vectors
suitable for the expression of Tnpl in bacterial or eukaryotic cells, and may further include
vectors suitable for the expression, and subsequent excision or inversion, of nucleotide
sequences of interest, either by integration into the host genome or as episomes. Such kits
may alternatively contain cells bearing DNA molecules with Tnpl recombination target-flanked
nucleotide sequences, as well as recombinant Tnpl, or Tnpl expression vectors for
production of Tnpl in vivo.

5.2 METHODS FOR USE OF TNPI-MEDIATED RECOMBINATION

The methods described herein relate to the use of Tnpl with the specific substrate
sequences TRT and TRT' for Tnpl-mediated recombination. Such recombination can be
used to delete or insert DNA sequences, resulting in modulation of gene expression. The
methods may be used in vivo or in vitro for engineering prokaryotic or eukaryotic genes.
Whereas recombination reactions with TRT substrates are fully reversible, recombination
reactions utilizing TRT' are much less reversible. Thus, TRT sites may be used where
reversibility of the reaction is desired. On the other hand, where greater stability is desired,
stabilization of integration could be achieved using TRT (or a functional variant thereof)
and transient expression of Tnpl.

Generally, the methods described herein have the following common components: 1)
a source of a functional Tnpl protein, such as purified or recombinant Tnpl recombinase or
a cell or cell extract comprising Tnpl activity; and 2) a substrate comprising
TRT or TRT', or functional variants thereof, optionally flanking a target nucleotide
sequence of interest.

5.2.1 IN VIVO METHODS

The invention encompasses in vivo methods for the use of the TnpI/TRT system for
genetic engineering of prokaryotic or eukaryotic genes. In general, such methods utilize
DNA replicons, such as, for example, DNA plasmid vectors or chromosomal DNA, which
are specifically designed to comprise Tnpl recombination target sites, e.g., TRT or TRT'
sequences, located at positions where recombination is desired to take place. Cells
expressing Tnpl activity, either constitutively or inducibly, are used as hosts for Tnpl-mediated
site-specific recombination. Tnpl-mediated recombination may be used for
deletion, insertion, or disruption of a gene of interest, or a particular domain of a gene of
interest, or to make chromosomal alterations, such as large deletions and inversions,
duplications and deletions by transallelic recombination, in inter- and intrachromosomal
rearrangements (e.g., to alter antibody specificity or receptor specificity in particular cell
types). Specific embodiments of these methods are described in detail hereinbelow.

5.2.1.1 INSERTIONAL INTEGRATION OF DNA SEQUENCES

In one aspect of the present invention, Tnpl-mediated recombination may be used to
integrate new nucleotide sequences into a DNA molecule, such as an episome or genomic
DNA. Targeted integration of DNA sequences into eukaryotic or prokaryotic genomes
may be used, for example, for expression of heterologous gene sequences in transgenic
animals and plants.

In one embodiment, for example, this may be accomplished by a two-step process.
First, Tnpl recombination target sites are introduced
into the host genome, and second, a gene of interest is recombined into the genome at the
Tnpl recombination target sites via Tnpl-mediated site-specific recombination. In a specific
embodiment, insertion of one or more Tnpl recombination target sites into the host genome
may be accomplished by constructing an integration vector comprising two Tnpl
recombination target sites separated by a selectable marker. Preferably, the selectable
marker is a negatively selectable marker, so that, in the second step, insertion of the gene
sequence of interest can be selected by selecting against the negatively selectable
marker. Alternatively, naturally occurring Tnpl recombination target sites, if they exist,
may be used. In the next step, a gene sequence of interest, flanked by Tnpl recombination
target sites, is introduced into the host cell. Tnpl is then expressed in the cell, allowing
recombination at the Tnpl recombination target sites, which results in insertion of the gene
sequence of interest into the host genome between the Tnpl recombination target sites.
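
The net result of the two-step process described above can be pictured with a toy Python model in which the segment lying between two target sites in the host DNA is replaced by the segment lying between two target sites in the incoming DNA. This is a sketch under stated assumptions: the site `AAGGC` is a hypothetical placeholder rather than a TRT sequence, `exchange_cassette` is an illustrative name, and intermediate recombination products and selection are not modeled.

```python
def exchange_cassette(genome: str, donor: str, site: str) -> str:
    """Swap the site-flanked segment of `genome` for the site-flanked segment of `donor`."""

    def flanked_span(dna: str) -> tuple[int, int]:
        # Locate the first two copies of the site and return the span between them.
        first = dna.find(site)
        second = dna.find(site, first + len(site)) if first != -1 else -1
        if first == -1 or second == -1:
            raise ValueError("two target sites are required")
        return first + len(site), second

    g_start, g_end = flanked_span(genome)
    d_start, d_end = flanked_span(donor)
    return genome[:g_start] + donor[d_start:d_end] + genome[g_end:]


SITE = "AAGGC"                               # hypothetical placeholder site
genome = "TT" + SITE + "CACA" + SITE + "TT"  # CACA stands in for the selectable marker
donor = SITE + "GTGTGT" + SITE               # GTGTGT stands in for the gene of interest

print(exchange_cassette(genome, donor, SITE))  # TTAAGGCGTGTGTAAGGCTT
```

In this model the marker segment is lost from the host DNA, which corresponds to the situation in which insertion is selected by selecting against a negatively selectable marker.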

5.2.1.2 REMOVAL OF DNA SEQUENCES

Another aspect of the present invention is the use of Tnpl as a site-specific
recombinase to delete nucleotide sequences of interest, such as, e.g., selectable markers or
other vector sequences, to disrupt gene function on an episome or in genomic DNA.

For example, in one embodiment, Tnpl-mediated deletion reactions may be used for
targeted excision of DNA sequences. In one embodiment, Tnpl-mediated recombination
may be used to delete a gene of interest from the genome of a higher organism. Tnpl
recombination target sites are inserted into the chromosome by homologous recombination
flanking a gene of interest in direct repeat orientation. Expression of Tnpl in such cells
results in recombination at the Tnpl recombination target sites and deletion of the gene of
interest.

In a specific embodiment, the TnpI/TRT system may be used to create conditional
knockout mutations in vivo, for example, in mice, in order to delete a desired gene sequence
at a particular time or in a particular tissue. Such conditional knock-out mutations have
been successfully produced using other site-specific recombinases, such as the Cre/lox
system (for reviews, see Metzger and Feil, 1999, Current Opinion in Biotechnology 10:
470-476; Sauer, 1998, Methods 14:381-392). To this end, two directly repeated Tnpl
recombination target sites may be inserted into the genome by homologous and site-specific
recombination in ES cells, such that the Tnpl recombination target sites flank a gene of
interest. Mice carrying such a "conditional" allele are crossed with Tnpl-expressing
transgenic mice. In the progeny, recombination will result in deletion of the gene of
interest. Where the Tnpl-expressing mice used are engineered to express Tnpl in an
inducible, developmental, or tissue specific manner, these methods may be used to induce
spatio-temporally controlled genetic alterations in vivo. Thus, stabilization of integration
may be achieved using TRT and transient expression of Tnpl.

In another embodiment, Tnpl-mediated recombination may be used to remove
unnecessary sequences from a DNA molecule after a genetic engineering step. For
example, when a heterologous nucleotide sequence is integrated into a genome or an
episome, extraneous sequences are often co-integrated, such as vector sequences required
for replication, maintenance, or selection of the integration vector. After insertion of the
DNA sequence into the genome, such sequences are no longer necessary. Using the
TnpI/TRT system of the present invention, such sequences may be removed. In a specific
embodiment, for example, a mammalian cell line is transformed, preferably by
electroporation, with a plasmid that contains a gene of interest and a selectable marker. The
selectable marker is flanked by either TRT or, preferably, TRT' in the same orientation,
creating a TRT'-selectable marker cassette that is operably separate from the gene of
interest. Once stable integration is accomplished, the host cell line is transfected with a
vector containing a Tnpl-coding sequence operably linked to a promoter suitable for
transient expression of the Tnpl gene. Upon expression of Tnpl, the selectable marker is
excised from the genomic DNA, leaving the gene of interest intact.

These methods are useful, for example, for the purposes of gene therapy or any
application aiming at producing food-grade genetically modified organisms (GMOs). In
these applications it may be desirable to introduce into a genome a particular functionally-modifying
gene, without the associated vector sequences that were required for maintenance
of a vector in a bacterial host. Such sequences may be removed using the Tnpl-mediated
deletion reaction described herein. In another embodiment, marker sequences used for
selection of recombinants may be removed. Specifically, a vector comprising: 1) a
eukaryotic gene of interest, 2) a bacterial selectable marker, and 3) an origin of replication is
used to transform plant cells, by homologous recombination of the gene into the cellular
DNA. For genetically-modified plants, the eukaryotic gene is desirable, but the origin of
replication and selectable marker, which in the field may be transferred horizontally to
sensitive bacteria, are particularly undesirable. To remove these unwanted sequences after
introducing the gene of interest, a plasmid vector is constructed comprising: (1) a gene
segment to be integrated into the plant cells' genomic DNA; (2) a bacterial host-relevant
selectable marker; and (3) two Tnpl recombination target sites that flank the selectable
marker and origin of replication in such a manner that, when recognized by Tnpl, the
bacterial-relevant genes are excised. The vector is linearized and transformed into the new
host cells. Once integration and expression of the desired gene has occurred, the bacterial-relevant
marker and origin can be removed by Tnpl-mediated recombination, as described
above.

TnpI-mediated recombination in vivo can also be used to disrupt gene function. This
can be accomplished by inserting one or more TnpI recombination target sites in or adjacent
to an endogenous gene of interest whose function is to be disrupted. TnpI is then
expressed in the cell, resulting in recombination between the TnpI recombination target
sites and a concomitant loss of gene function. Preferably, TnpI is expressed by induction of
a TnpI gene residing on a plasmid under the control of an inducible promoter. Preferably, the
plasmid contains an appropriate origin of replication conferring medium copy number to
high copy number, and a gene that confers resistance to an antibiotic to which the host is
sensitive.

In a specific embodiment, for example, a prokaryotic host capable of expressing
TnpI as above is used. The second plasmid contains, in order, a selectable marker, a
recombination target, a second selectable marker oriented so as to be transcribed in a
direction opposite to that of the first selectable marker, a second recombination target, and a
promoter in the same orientation as the first selectable marker and capable of driving
expression of the first selectable marker upon recombination. Preferably the selectable
markers confer antibiotic resistance, but they may also confer prototrophy to the appropriate
auxotrophic strain. Prior to recombination, only the second selectable marker is expressed.
Upon expression of TnpI, however, the second selectable marker is excised and lost; the
excision joins the promoter and first selectable marker, which is then expressed.

5.2.1.3 INDUCTION OF GENE EXPRESSION
In another aspect of the invention, TnpI-mediated site-specific recombination may be
used to modulate, i.e., increase or decrease, the expression of specific genes within a host
cell. In one embodiment, a vector is constructed comprising a promoter, a gene of interest,
and a spacer sequence separating the promoter and the gene of interest such that the
promoter is not operably associated with the gene and no expression occurs. The spacer
sequence is flanked by two directly repeated TnpI recombination target sites. When TnpI is
expressed in the cell, recombination between the target sites results in excision of the
spacer. This results in the promoter becoming operably associated with the gene of interest,
inducing gene expression. If desired, the spacer region may comprise a selectable
marker, preferably a negative selectable marker, so that removal of the spacer region may be
selected for. In this embodiment, TnpI expression can be accomplished in a number of
different ways, including introducing an expression vector that expresses TnpI in an
inducible or constitutive manner. For example, to induce TnpI expression, a second vector
comprising a TnpI coding sequence operably linked to an inducible or constitutive promoter
may be introduced into the host cell.
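
The spacer-excision logic described above can be pictured with a small toy model. The following Python sketch is illustrative only: the element names ("promoter", "TRT", "spacer", "gene"), the functions excise and is_expressed, and the rule that a gene is expressed only when no spacer lies between it and the promoter are simplifying assumptions made for this example, not part of the disclosed method.

```python
# Toy model of recombinase-mediated excision between two directly repeated sites.
# All names below are hypothetical and chosen for illustration.

def excise(construct, site="TRT"):
    """Remove the segment lying between the first two occurrences of `site`.

    One copy of the site remains in the construct, mirroring the text's
    description that excision leaves the flanking sequences joined.
    """
    first = construct.index(site)
    second = construct.index(site, first + 1)
    return construct[:first + 1] + construct[second + 1:]


def is_expressed(construct, promoter="promoter", gene="gene", blockers=("spacer",)):
    """Assumed rule: the gene is expressed if the promoter precedes it
    and no blocking element sits between the two."""
    p = construct.index(promoter)
    g = construct.index(gene)
    return p < g and not any(element in blockers for element in construct[p + 1:g])


before = ["promoter", "TRT", "spacer", "TRT", "gene"]
after = excise(before)

print(before, "expressed:", is_expressed(before))
print(after, "expressed:", is_expressed(after))
```

In this sketch the construct is an ordered list, so the same excise function could be reused for other cassette layouts, provided the two target sites are in direct repeat.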

In another embodiment, TnpI-mediated recombination is used to control the
expression of a target gene by regulating the expression of a regulatory molecule, such as,
for example, a transcription factor which induces expression of a gene, or an inhibitory
factor which represses the expression of the gene. A vector containing the DNA sequence of the
regulatory molecule flanked by recombination sites is introduced into a eukaryotic cell. A
second DNA sequence comprising an inducible TnpI gene is also introduced into the cell.
Upon activation of the TnpI gene, the gene for the regulatory molecule is deleted, and the
target gene becomes activated or inactivated.

In another embodiment, the TnpI/TRT system may be used for in vivo expression
technology to identify transient expression of genes in eukaryotic and Gram-negative
bacterial cells in response to particular growth conditions or other environmental signals.
This method is described by Salamitou et al. (Salamitou et al., 1997, supra), but is limited
to use in Gram-positive bacteria. In this method, two compatible plasmids are constructed,
a TnpI expression plasmid and a reporter plasmid. The expression plasmid is constructed
with a promoterless TnpI gene, into which gene sequences, e.g., sequences derived from a
genomic library, may be inserted. Activation of a promoter cloned upstream from the
promoterless TnpI gene results in expression of TnpI, which, in turn, mediates
recombination between the two TnpI recombination target sites, leading to excision of the
selectable marker. A second plasmid, the reporter plasmid, which is compatible with the
first plasmid, comprises one or more selectable marker sequences and two directly repeated
TnpI recombination target sites. The reporter plasmid may be set up to accommodate either
a positive or a negative selection system, or both. Where negative selection is desired, the
reporter plasmid is engineered so that TnpI recombination target sites flank the negatively
selectable marker in direct repeat orientation, such that TnpI expression leads to excision of
the marker. Thus, recombinant cells can be selected for by selecting against the negatively
selectable marker. Alternatively, or in addition to the negatively selectable marker, a
positive selection system may be used. In this case, the reporter plasmid is engineered so
that excision of the sequences between the TnpI recombination target sites results in the
generation of a functional positive selectable marker. In this method, the TnpI
recombination target site is preferably TRT', which favors the intramolecular excision
reaction. This system may be used to screen eukaryotic and prokaryotic genomic libraries
for nucleotide sequences which respond to environmental signals, allowing the
identification of genes involved in important signaling pathways.
5.2.2 IN VITRO METHODS

The invention also encompasses in vitro methods for genetic engineering of
prokaryotic or eukaryotic genes. In these methods, a purified or isolated TnpI activity is
supplied to a substrate DNA molecule comprising TRT or TRT' sites. Preferably, purified
recombinant TnpI is used. The substrate DNA molecule used in these in vitro methods
comprises a DNA molecule containing two directly repeated TRT or TRT' sites, between
which sites resides a target nucleotide sequence whose rearrangement is desired, i.e., the
target sequence to be deleted or inserted into a second substrate molecule. The substrate is
preferably a supercoiled DNA molecule.

In one embodiment, for example, in vitro TnpI recombination reactions are
performed as follows. A supercoiled DNA plasmid containing two copies of a functional
TnpI recombination target site sequence is used as a recombination substrate.
Recombination reactions are performed in a 20 µl volume using approximately 500 ng of
TnpI recombinase protein and DNA substrate in a buffer containing 50 mM Tris-HCl pH
8.0, 25 mM KCl, 1.25 mM EDTA, 5 mM spermidine, 10% glycerol and 25 µg/ml BSA, and
incubated at 37 °C for different times. In a preferred embodiment the incubation time is 30
minutes. Preferably, the TnpI recombinase protein is purified. The recombination products
may be analysed by restriction enzyme digestion followed by agarose gel electrophoresis of
the reactions.
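
As a practical aid for assembling the 20 µl reaction described above, the following Python sketch applies the usual dilution relation (stock concentration x stock volume = final concentration x final volume) to each buffer component. Only the final concentrations and the reaction volume are taken from the text; the stock concentrations, the dictionary names and the function buffer_volumes are assumptions made for illustration, and a laboratory would substitute its own stocks.

```python
# Final concentrations are those stated in the text.
FINAL = {
    "Tris-HCl pH 8.0": 50.0,   # mM
    "KCl": 25.0,               # mM
    "EDTA": 1.25,              # mM
    "spermidine": 5.0,         # mM
    "glycerol": 10.0,          # percent
    "BSA": 25.0,               # ug/ml
}

# ASSUMED stock concentrations, in the same units as FINAL (hypothetical values).
STOCK = {
    "Tris-HCl pH 8.0": 1000.0,
    "KCl": 1000.0,
    "EDTA": 50.0,
    "spermidine": 100.0,
    "glycerol": 50.0,
    "BSA": 1000.0,
}


def buffer_volumes(total_ul=20.0):
    """Return the volume (in ul) of each stock needed for one reaction."""
    return {name: FINAL[name] * total_ul / STOCK[name] for name in FINAL}


volumes = buffer_volumes()
# Volume left over for the recombinase, the DNA substrate and water.
remaining_ul = 20.0 - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} ul")
print(f"protein + DNA + water: {remaining_ul:.2f} ul")
```

With the assumed stocks, the buffer components occupy well under half of the reaction volume, which leaves room for the protein and DNA additions; different stocks would change these volumes but not the calculation.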

In various embodiments, for example, the in vitro TnpI recombination reaction is
used for the construction and manipulation of recombinant DNA molecules in vitro. Such
methods have been described for use with other site-specific recombinases, such as the
Cre/lox system (see U.S. Patent No. 5,851,808, issued Dec. 22, 1998; International
Publication No. WO 00/05355, published Feb. 3, 2000; Liu et al., 2000, Methods in
Enzymology 328:530-549). Using the TnpI/TRT system in vitro, one can rearrange,
shuffle, and subclone DNA molecules from one type of vector into another, without the
need for costly restriction enzymes and ligases or time-consuming and labor-intensive in vitro
manipulations. For example, in one embodiment, the TnpI/TRT system may be used to
subclone or "shuttle" a nucleotide sequence (or library of nucleotide sequences) of interest
from a first DNA molecule (a "donor" vector) into a second DNA molecule (a "recipient"
vector), which recipient vector is particularly suited for a desired purpose, such as
expression in a particular host cell type. The donor vector comprises a TnpI recombination
target site, the gene of interest, an origin of replication, and, optionally, one or more
selectable marker genes. The recipient molecule comprises one or more nucleotide
regulatory sequences juxtaposed to a TnpI recombination target, an origin of replication,
and, optionally, one or more selectable marker genes. The nucleotide regulatory sequence
may be any sequence that may modify, promote or enhance expression of the gene of interest
in the host cell. Examples of such nucleotide regulatory sequences include, but are not
limited to, eukaryotic or prokaryotic transcriptional regulatory sequences such as promoters,
enhancer elements, and/or transcription factor binding sites, such as cell-type specific or
tissue-specific transcription factor binding sites, translational regulatory elements, and/or
post-translational processing sites. Preferably, in this embodiment, where unconstrained
recombination activity is desirable, the TnpI recombination target site is a TRT sequence.

In another embodiment, for example, the in vitro deletion reaction between TnpI
recombination target sites may be used to produce circular DNA molecules devoid of
undesired nucleotide elements. Such a method may be used, for example, in the
construction of DNA vaccines, where it would be desirable to remove unnecessary vector
sequences from a DNA molecule before administration of the DNA vaccine to a subject.
For example, such a vector may be constructed by placing the DNA sequence of interest
between two directly repeated TnpI recombination target sites, which are later useful for
removing the replication origin and resistance marker genes, producing a final DNA vaccine
product useful for therapeutic treatment. Preferably, in this embodiment, where the
constrained recombination activity may be desirable, the TnpI recombination target is a
TRT' sequence.

Finally, the TnpI/TRT system may be used in conjunction with other site-specific
recombination systems with different specificities, e.g., Cre/lox or Flp/FRT recombinases,
to expand the versatility of the in vitro methods (Meyers et al., 1998, Nat. Genet. 18:136-
41). For example, plasmid vectors comprising both TnpI recombination target sites
together with one or more other site-specific recombinase recognition sites could be used to
delete, insert, and/or rearrange DNA sequences at particular steps of a multi-step cloning
process. Such methods may be particularly useful, for example, in the fields of proteomics
and directed evolution to construct novel proteins with novel activities, which require
multi-step shuffling methods.

6. EXAMPLES 

The invention is illustrated by the following experiments, which analyze the site-
specific recombinase activity of TnpI both in vivo and in vitro.

The examples presented hereinbelow demonstrate that TnpI mediates recombination,
both in vivo and in vitro, using two distinct target sites, the minimal TRT site and the full
TRT' site, and that TnpI activity at these two distinct sites leads to topologically distinct
intermediates and functionally distinct recombinant products. The experiments further
demonstrate that TnpI is active in a broad range of host cells, without the need for accessory
factors. As demonstrated herein, these previously unknown abilities of TnpI and its
recombination substrates have important implications for the use of TnpI recombinase in
genetic engineering.

6.1 ANALYSIS OF TNPI-MEDIATED RECOMBINATION IN VIVO

One of the limitations of other well-known recombinase systems, such as Cre/lox or
Flp/FRT, is the complete reversibility of their reaction mechanisms. Consequently, the rate
of production of final product is impeded by the rate at which product undergoes the reverse
reaction to re-form the substrate (Logie and Stewart, 1995, Proc. Natl. Acad. Sci. USA 92:
5940-5944). The experiments in this section demonstrate that the reversibility of TnpI-
mediated site-specific recombination depends on the nature and structure of the
recombination target site. Thus, substrates can be specifically designed according to the
desired reversibility of the excision/insertion reaction.

6.1.1 DIFFERENTIAL ACTIVITIES OF TRT' AND TRT SUBSTRATES

Previous reports identified a 249 bp fragment of transposon Tn4430 as the shortest
DNA fragment that includes the sequences required for TnpI recombination (see Figure 2;
Salamitou et al., 1997, supra; Sanchis et al., 1997, supra). As depicted in Figure 2, this
region contains three distinct TnpI binding sites: a 32 bp inverted repeat sequence, and two
downstream 16 bp direct repeat sequences. To analyze the requirements for TnpI-mediated
recombination, the reaction was analyzed in the heterologous host E. coli.

For heterologous expression, an inducible TnpI expression vector, pYZ-BAD-TnpI,
was constructed. pYZ-BAD-TnpI was created by inserting the coding sequence for TnpI,
shown in Figure 3, into the vector pBAD24, such that the expression of TnpI was under the
control of the arabinose-inducible promoter, PBAD. The resultant vector, pYZ-BAD-TnpI, as
shown in Figure 4, comprises PBAD operatively linked to the TnpI coding sequence, a p15A
replication origin, a kanamycin resistance selectable marker, and AraC, the arabinose-
responsive repressor that regulates PBAD. (In addition, pBAD expression vectors have been
constructed which express TnpI as a His-tag fusion protein. Both forms of protein were
found to be equally active, both in vivo and in vitro.)

Initially, TnpI-mediated recombination was tested in E. coli using a "full-length"
TnpI recombination target sequence. Two copies of a 244 bp fragment comprising the left
IR and all four TnpI binding sites, called TRT" (see Figure 1A), were cloned into a plasmid
in direct repeat orientation and used as a substrate for recombination in E. coli. This
plasmid construct was then transformed into E. coli cells containing the TnpI expression
plasmid pYZ-BAD-TnpI, and induced with arabinose. Using TRT", recombination was
detected upon induction of TnpI, indicating that the heterologous host E. coli supports TnpI-
mediated recombination.

To further refine the sequence requirements of TnpI-mediated recombination,
various substrates were tested for recombination activity in E. coli cells expressing TnpI.
DNA plasmids containing directly repeated copies of TnpI recombination target sites of
various lengths were tested as substrates. In particular, plasmids were constructed
containing directly repeated copies of either the abbreviated 116 bp TnpI recombination
target site, TRT' (shown in Figure 1B), containing both the 32 bp inverted repeat sequence
and the two downstream 16 bp direct repeat sequences, or the shorter 32 bp TnpI
recombination target site, TRT (shown in Figure 1C), containing only the 32 bp inverted
repeat sequences. Plasmids containing TRT or TRT' sites were introduced into E. coli cells
in the presence or absence of a TnpI expression plasmid. Plasmid products were then
purified from the cells and separated on agarose gels containing ethidium bromide.

The result is shown in Figure 5A-B. As depicted in Figure 5A, complete
recombination between the two recombination sites of a substrate, S, produces two
recombination products, P, one of which lacks a plasmid origin (indicated by a filled circle)
and is therefore unable to replicate in vivo. Products of TnpI-mediated recombination in E.
coli are shown in the agarose gel depicted in Figure 5B. Plasmids containing TRT or TRT'
sites were tested in the presence of TnpI expression. The results of this experiment indicate
that DNA plasmids containing directly repeated 116 bp TRT' sites yielded the expected
recombination product in the presence of TnpI expression (lane 5), but not in the absence of
TnpI expression (lane 4). However, using the DNA plasmid comprising the minimal TRT
(lanes 6 and 7), an unexpected result was obtained. Rather than the lower-molecular weight
band representing the expected product, a ladder of multiple higher-molecular weight bands
was observed upon induction of TnpI expression (lane 7). This ladder of high molecular
weight bands represents a series of plasmid multimers, the products of intermolecular
recombination, rather than deletion. A small amount of the expected lower molecular
weight product is seen at the bottom of the ladder.

These unexpected results indicate that the 116 bp TRT' and the shorter 32 bp TRT
exhibit functionally distinct properties as TnpI recombination substrates. Apparently, when
TRT' is used as a substrate, the intramolecular (forward) reaction, which yields the two
products, is topologically different from the intermolecular (backward) reaction, which
recreates the starting material DNA substrate. This may be explained by a difference in the
topological states of the TnpI-TRT complex versus TnpI-TRT' protein complexes. The
TRT' complex may provide an asymmetric component so that the intramolecular reaction is
favored over the intermolecular reaction. This topological component is discussed further
below (see Section 6.2.3 and Figure 9 below).

In sum, the use of alternative, distinct TnpI recombination target sites allows options
for use in genetic engineering. Thus, the use of the full TRT' favors the forward,
intramolecular excision reaction, and disfavors the reverse, intermolecular insertion
reaction, so the full-length TRT' can be used when the irreversible intramolecular
excision reaction is desired. The minimal TRT, on the other hand, also supports
intermolecular recombination.

6.2 ANALYSIS OF TNPI-MEDIATED RECOMBINATION IN VITRO

The following in vitro recombination experiments were performed to identify the
functional domains and in vitro requirements of TnpI-mediated recombination.
Importantly, it was found that the purified TnpI is sufficient to promote recombination at
the TRT and TRT' sites, albeit with different efficiencies, and that additional auxiliary
factors are not required. Topological analysis of the in vitro recombination products also
revealed that recombination at TRT and TRT' involves the formation of recombination
complexes with different levels of complexity.

6.2.1 TNPI OVEREXPRESSION AND PURIFICATION

To overexpress the TnpI protein in E. coli, a BglII-HindIII DNA fragment carrying
the TnpI coding sequence was amplified by PCR and inserted into the pBAD/HisA
expression vector (Invitrogen). As depicted in Figure 6A (upper panel), the resulting
plasmid, pGIV007, encodes a His-tag::TnpI fusion protein, H-TnpI, the expression of which
is under the control of the E. coli arabinose operon promoter, pBAD. An NcoI deletion of
pGIV007 was constructed in order to delete the His-tag region and to generate the plasmid
pGIV008, which expresses the wild-type TnpI protein (Figure 6A, lower panel). Cultures
of E. coli cells containing these two plasmids were induced with 0.2% L-arabinose, and the
expressed proteins were purified (H-TnpI) or partially purified (TnpI) by affinity
chromatography on nickel resin, and displayed by SDS-PAGE gel electrophoresis as shown
in Figure 6B.

6.2.2 DIFFERENTIAL ACTIVITY OF TRT' AND TRT IN VITRO

The recombination activity of TnpI and H-TnpI was examined in vitro. DNA
plasmids containing either two directly repeated copies of the TRT site (pGIV016) or two
directly repeated copies of the TRT' site (pGIV014) were used as recombination substrates.
As depicted schematically in Figure 7A, complete recombination between the two TRT or
TRT' copies results in a deletion reaction, producing two smaller plasmids, each with a single
TnpI recombination target site. Recombination reactions were performed with ~500 ng of
purified recombinase protein and supercoiled DNA in a volume of 20 µl, containing 50 mM
Tris-HCl pH 8.0, 25 mM KCl, 1.25 mM EDTA, 5 mM spermidine, 10% glycerol and 25
µg/ml BSA. Reactions were incubated at 37 °C for 30 min.

The recombination reactions were then analyzed by digestion with the restriction
enzyme HindIII, followed by electrophoresis in a 0.8% agarose gel. Since the
starting plasmids each contained two HindIII sites, digestion with HindIII yields two
fragments, S1 and S2 (see Figure 7A). However, HindIII digestion of the recombination
products, each of which contains a single HindIII site, yields two distinguishable fragments,
P1 and P2 (see Figure 7A). Thus, HindIII digestion readily distinguished the starting
material from the two products. As shown in the agarose gels in Figure 7B, HindIII
digestion of the recombination reactions resulted in the expected size fragments
corresponding to starting material and recombination products using both TRT and TRT' as
substrates. Thus, these results demonstrated that the purified TnpI protein is sufficient to
promote efficient recombination at TRT and TRT' in vitro, without requiring additional host
factors.
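
The reasoning behind this diagnostic digest can be sketched computationally. The following Python example models a circular plasmid with two directly repeated recombination sites and two restriction sites, and compares the fragment sizes expected before and after deletion. All coordinates and the plasmid length are hypothetical values chosen for illustration (they are not the maps of pGIV014 or pGIV016), and the functions digest_circular and resolve are assumed helper names.

```python
def digest_circular(length, cut_sites):
    """Return sorted fragment sizes (bp) after cutting a circle at `cut_sites`."""
    if not cut_sites:
        return []  # an uncut circle yields no linear fragments
    sites = sorted(s % length for s in cut_sites)
    fragments = [b - a for a, b in zip(sites, sites[1:])]
    # The last fragment wraps around coordinate 0 of the circle.
    fragments.append(length - sites[-1] + sites[0])
    return sorted(fragments)


def resolve(length, site_a, site_b, cut_sites):
    """Model deletion between directly repeated sites at site_a < site_b.

    Returns two (length, cut_sites) tuples: the excised circle and the
    remaining circle, each with its cut sites re-based to new coordinates.
    """
    excised_len = site_b - site_a
    excised_cuts = [c - site_a for c in cut_sites if site_a <= c < site_b]
    remaining_cuts = [
        c if c < site_a else c - excised_len
        for c in cut_sites
        if not (site_a <= c < site_b)
    ]
    return (excised_len, excised_cuts), (length - excised_len, remaining_cuts)


# Hypothetical 5000 bp substrate: target sites at 1000 and 2500,
# restriction sites at 2000 and 4000 (one inside, one outside the deleted segment).
substrate_fragments = digest_circular(5000, [2000, 4000])
excised, remaining = resolve(5000, 1000, 2500, [2000, 4000])
product_fragments = digest_circular(*excised) + digest_circular(*remaining)

print("substrate fragments (S1, S2):", substrate_fragments)
print("product fragments (P1, P2):", sorted(product_fragments))
```

Because each product circle carries a single restriction site, digestion simply linearizes it, so the product bands equal the sizes of the two circles. The assay is informative only when these sizes differ from the substrate fragment sizes, which depends on where the restriction sites lie relative to the recombination sites.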

Next, a kinetic time-course analysis of TnpI-mediated recombination in vitro
was performed using TRT and TRT'. Figure 7C shows the results of a time
course analysis of recombination at TRT (squares) and TRT' (diamonds). For each time
point, the proportion of the products was quantified by densitometry analysis of an
electrophoresis gel containing HindIII-digested reactions. This analysis revealed that
recombination of a plasmid carrying two directly repeated TRT' sequences is faster than
recombination of a similar plasmid containing the TRT core sequence, indicating that the
additional TnpI binding sites in TRT' stimulated recombination. Similar recombination
activity was observed for the wild-type and His-tagged TnpI proteins, demonstrating that
the recombinase tolerates N-terminal peptide fusions.
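
One simple way to express the densitometry readout as a proportion of product is sketched below in Python. The text does not specify the formula used, so the calculation here (summed product band intensity divided by the summed intensity of all substrate and product bands in a lane) is an assumption, as are the function name fraction_recombined and the example intensities, which are invented placeholders rather than measured data.

```python
def fraction_recombined(substrate_bands, product_bands):
    """Fraction of DNA in product bands, assuming intensity is proportional to DNA mass."""
    total = sum(substrate_bands) + sum(product_bands)
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return sum(product_bands) / total


# Placeholder intensities for one lane: (S1, S2) and (P1, P2). Not real data.
example = fraction_recombined([30.0, 20.0], [25.0, 25.0])
print(f"fraction recombined: {example:.2f}")
```

Applying the same function to each lane of a time course would give one value per time point, which could then be plotted for TRT and TRT' substrates in the manner of Figure 7C.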



6.2.3 RECOMBINATION AT TRT AND TRT' INVOLVES THE
FORMATION OF RECOMBINATION COMPLEXES WITH
DIFFERENT LEVELS OF COMPLEXITY

The topology of in vitro recombination products was analyzed by treating the
reactions with DNase I in the presence of ethidium bromide. This treatment introduces
single nicks in DNA, allowing removal of supercoiling while preserving the intercatenation
nodes introduced into the substrate during recombination (see Figure 8A). Plasmid
substrates were incubated in the presence or absence of H-TnpI, and then treated in
reactions with 1 ng/ml of DNase I (+) and 0.3 mg/ml of ethidium bromide, or without
DNase I (-), as indicated. Nicked reactions were then analyzed by high resolution agarose
gel electrophoresis (0.7% agarose) to compare the complexity of the products resulting from
TRT or TRT' recombination.

As shown in Figure 8A, intramolecular recombination between directly repeated
TRT sequences generated a variety of deletion products (catenanes) in which the two DNA
circles remained interlinked a certain number of times (i.e., 2, 4, 6, etc.; labeled 2, 4, and 6
in Figure 8A). Recombination also produced faint bands corresponding to knotted DNA
molecules containing 3, 5, etc. nodes (labeled 3 and 5 in Figure 8A). Such knots arose from
consecutive recombination events using the initial catenated products as a substrate. These
results indicate that TRT recombination occurs freely by random collision between
recombination sites.

In contrast, the TRT' recombination substrate yielded exclusively 2-noded catenane
products (Figure 8B). This result indicated that recombination is catalyzed within a protein-
DNA complex having a specific geometry. Assembly of the recombination complex may
involve multiple interactions between TnpI molecules bound onto the inverted and direct
repeats of TRT'. According to the current model, depicted in Figure 9, the topology of the
DNA within this complex is such that it can only form efficiently if the two recombination
sites are present in tandem on the same DNA molecule, thereby providing directionality to the
recombination reaction. When TnpI is also bound to the direct repeats (DR1, DR2), a
specific higher order wrapping is created in addition to the one formed when TnpI is bound
to TRT.

6.3 TNPI-MEDIATED RECOMBINATION IN EUKARYOTIC CELLS

The experiments in this section demonstrate the ability of TnpI to mediate site-
specific recombination in eukaryotic cells. A eukaryotic TnpI recombination reporter cell
line was constructed by introducing pSVpaT, depicted in Figure 10, into mouse ES cells.
As shown in Figure 10, pSVpaT contains an SV40 promoter, a puromycin resistance gene
flanked by two direct copies of a 32 bp TRT sequence containing the inverted repeat, the β-
galactosidase gene (lacZ), and the chloramphenicol resistance gene.

The nucleotide sequence encoding TnpI was cloned into the eukaryotic expression
plasmid pCAGGS to construct the eukaryotic TnpI expression plasmid pCAGGS/TnpI,
which is depicted in Figure 11. pCAGGS/TnpI includes the TnpI coding sequence, the
expression of which is driven by the chicken beta-actin promoter, PCBA.

ES cells containing stably integrated pSVpaT sequences were selected for by
selecting for puromycin resistance. Before TnpI-mediated recombination, these TnpI
reporter ES cells do not express LacZ (see Figure 12, top). However, TnpI-mediated
recombination removes the puromycin resistance gene, which lies between the SV40
promoter and lacZ, and juxtaposes the lacZ gene next to the SV40 promoter (see Figure 12,
bottom). Thus, recombination results in the expression of lacZ, which can be detected by
the appearance of blue colonies upon the introduction of lactose.

Figure 13 shows the results of this experiment. Introduction of the pCAGGS/TnpI
expression vector into the TnpI reporter ES cells resulted in the expression of LacZ, as
indicated by the presence of blue colonies, shown in Figure 13.

This previously unreported ability of TnpI to catalyze site-specific recombination in
eukaryotic hosts expands the commercial potential of TnpI for genetic engineering.


The invention described and claimed herein is not to be limited in scope by the
specific embodiments herein disclosed since these embodiments are intended as illustration
of several aspects of the invention. Any equivalent embodiments are intended to be within
the scope of this invention. Indeed, various modifications of the invention in addition to
those shown and described herein will become apparent to those skilled in the art from the
foregoing description. Such modifications are also intended to fall within the scope of the
appended claims. Throughout this application various references are cited, including patent
applications and publications, the contents of each of which is hereby incorporated by
reference into the present application in its entirety for all purposes.
